Here I study how biomass density changes across treatments in the PatchSizePilot. In particular, I’m studying how biomass density changes across:
culture_info = read.csv(here("data", "PatchSizePilot_culture_info.csv"), header = TRUE)
datatable(culture_info[,1:10],
rownames = FALSE,
options = list(scrollX = TRUE),
filter = list(position = 'top',
clear = FALSE))
### --- IMPORT --- ###
load(here("data", "population", "t0.RData")); t0 = pop_output
load(here("data", "population", "t1.RData")); t1 = pop_output
load(here("data", "population", "t2.RData")); t2 = pop_output
load(here("data", "population", "t3.RData")); t3 = pop_output
load(here("data", "population", "t4.RData")); t4 = pop_output
load(here("data", "population", "t5.RData")); t5 = pop_output
load(here("data", "population", "t6.RData")); t6 = pop_output
load(here("data", "population", "t7.RData")); t7 = pop_output
rm(pop_output)
### --- TIDY --- ###
#Column: time
t0$time = NA
t1$time = NA
#Column: replicate_video
t0$replicate_video = 1:12 #In t1 I took 12 videos of a single
t1$replicate_video = 1 #In t1 I took only 1 video/culture
t2$replicate_video = 1 #In t2 I took only 1 video/culture
t3$replicate_video = 1 #In t3 I took only 1 video/culture
t4$replicate_video = 1 #In t4 I took only 1 video/culture
t5$replicate_video = 1 #In t5 I took only 1 video/culture
t6 = t6 %>%
rename(replicate_video = replicate)
t7 = t7 %>%
rename(replicate_video = replicate)
#Create an elongated version of t0 so that each of the 110 cultures can have 12 video replicates at t0.
elongating_t0 = NULL
for (video in 1:nrow(t0)){
for (ID in 1:nrow(culture_info)) {
elongating_t0 = rbind(elongating_t0, t0[video,])
}
}
ID_vector = rep(1:nrow(culture_info),
times = nrow(t0))
elongating_t0$culture_ID = ID_vector
#Merge previous data-sets
t0 = merge(culture_info,elongating_t0, by="culture_ID")
t1 = merge(culture_info,t1, by = "culture_ID")
t2 = merge(culture_info,t2, by = "culture_ID")
t3 = merge(culture_info,t3, by = "culture_ID")
t4 = merge(culture_info,t4, by = "culture_ID")
t5 = merge(culture_info,t5, by = "culture_ID")
t6 = merge(culture_info,t6, by = "culture_ID")
t7 = merge(culture_info,t7, by = "culture_ID")
ds_biomass = rbind(t0, t1, t2, t3, t4, t5, t6, t7)
rm(elongating_t0, t0, t1, t2, t3, t4, t5, t6, t7)
#Column: time_point
ds_biomass$time_point[ds_biomass$time_point=="t0"] = 0
ds_biomass$time_point[ds_biomass$time_point=="t1"] = 1
ds_biomass$time_point[ds_biomass$time_point=="t2"] = 2
ds_biomass$time_point[ds_biomass$time_point=="t3"] = 3
ds_biomass$time_point[ds_biomass$time_point=="t4"] = 4
ds_biomass$time_point[ds_biomass$time_point=="t5"] = 5
ds_biomass$time_point[ds_biomass$time_point=="t6"] = 6
ds_biomass$time_point[ds_biomass$time_point=="t7"] = 7
ds_biomass$time_point = as.character(ds_biomass$time_point)
#System nr 40 still here
#Column: day
ds_biomass$day = NA
ds_biomass$day[ds_biomass$time_point== 0] = 0
ds_biomass$day[ds_biomass$time_point== 1] = 4
ds_biomass$day[ds_biomass$time_point== 2] = 8
ds_biomass$day[ds_biomass$time_point== 3] = 12
ds_biomass$day[ds_biomass$time_point== 4] = 16
ds_biomass$day[ds_biomass$time_point== 5] = 20
ds_biomass$day[ds_biomass$time_point== 6] = 24
ds_biomass$day[ds_biomass$time_point== 7] = 28
#Column: size_of_connected_patch
ds_biomass$size_of_connected_patch[ds_biomass$eco_metaeco_type == "S"] = "S"
ds_biomass$size_of_connected_patch[ds_biomass$eco_metaeco_type == "S (S_S)"] = "S"
ds_biomass$size_of_connected_patch[ds_biomass$eco_metaeco_type == "S (S_L)"] = "L"
ds_biomass$size_of_connected_patch[ds_biomass$eco_metaeco_type == "M (M_M)"] = "M"
ds_biomass$size_of_connected_patch[ds_biomass$eco_metaeco_type == "L"] = "L"
ds_biomass$size_of_connected_patch[ds_biomass$eco_metaeco_type == "L (L_L)"] = "L"
ds_biomass$size_of_connected_patch[ds_biomass$eco_metaeco_type == "L (S_L)"] = "S"
#Column: eco_metaeco_type
ds_biomass$eco_metaeco_type = factor(ds_biomass$eco_metaeco_type,
levels = c('S',
'S (S_S)',
'S (S_L)',
'M',
'M (M_M)',
'L',
'L (L_L)',
'L (S_L)'))
ecosystems_to_take_off = 60 #Culture number 60 because it was spilled (isolated large patch, high disturbance, system nr = 40)
ds_biomass = ds_biomass %>%
filter(! culture_ID %in% ecosystems_to_take_off)
ds_for_evaporation = ds_biomass
ds_biomass = ds_biomass %>%
select(culture_ID,
patch_size,
disturbance,
metaecosystem_type,
bioarea_per_volume,
replicate_video,
time_point,
day,
metaecosystem,
system_nr,
eco_metaeco_type,
size_of_connected_patch) %>%
relocate(culture_ID,
system_nr,
disturbance,
time_point,
day,
patch_size,
metaecosystem,
metaecosystem_type,
eco_metaeco_type,
size_of_connected_patch,
replicate_video,
bioarea_per_volume)
datatable(ds_biomass,
rownames = FALSE,
options = list(scrollX = TRUE),
filter = list(position = 'top',
clear = FALSE))
ds_regional = ds_biomass %>%
filter(metaecosystem == "yes") %>%
group_by(culture_ID,
system_nr,
disturbance,
day,
time_point,
patch_size,
metaecosystem_type) %>%
summarise(patch_mean_bioarea_across_videos = mean(bioarea_per_volume)) %>%
group_by(system_nr, disturbance, day, time_point, metaecosystem_type) %>%
summarise(regional_mean_bioarea = mean(patch_mean_bioarea_across_videos))
metaecosystems_to_take_off = 40 #System 40 was the system of culture 60 that I spilled
ds_regional = ds_regional %>%
filter(! system_nr %in% metaecosystems_to_take_off)
datatable(ds_regional,
rownames = FALSE,
options = list(scrollX = TRUE),
filter = list(position = 'top',
clear = FALSE))
We want to understand if the regional biomass produced by an
ecosystem with a small and a large patch
(metaecosystem_type = S_L) is lower than the regional
biomass produced by an ecosystem with two medium patches
(metaecosystem_type = M_M).
Let’s start by plotting the single ecosystems to see that everything is fine. To make the patterns clear let’s plot the low disturbance and high disturbance in two different figures. We first plot the single meta-ecosystems and then their box plots.
ds_regional %>%
filter ( disturbance == "low") %>%
filter (metaecosystem_type == "S_L" |
metaecosystem_type == "M_M") %>%
ggplot (aes(x = day,
y = regional_mean_bioarea,
group = system_nr,
fill = system_nr,
color = system_nr,
linetype = metaecosystem_type)) +
geom_line () +
labs(x = "Day",
y = "Regional bioarea (something/µl)",
title = "Disturbance = low",
fill = "System nr",
color = "System nr",
linetype = "") +
scale_y_continuous(limits = c(0, 6250)) +
scale_x_continuous(limits = c(-2, 30)) +
scale_linetype_discrete(labels = c("medium-medium",
"small-large")) +
theme_bw() +
theme(panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
legend.position = c(.95, .95),
legend.justification = c("right", "top"),
legend.box.just = "right",
legend.margin = margin(6, 6, 6, 6)) +
geom_vline(xintercept = first_perturbation_day,
linetype="dotdash",
color = "grey",
size=0.7) +
labs(caption = "Vertical grey line: first perturbation")
ds_regional %>%
filter ( disturbance == "high") %>%
filter (metaecosystem_type == "S_L" | metaecosystem_type == "M_M") %>%
ggplot (aes(x = day,
y = regional_mean_bioarea,
group = system_nr,
fill = system_nr,
color = system_nr,
linetype = metaecosystem_type)) +
geom_line () +
labs(x = "Day",
y = "Regional bioarea (something/µl)",
title = "Disturbance = high",
fill = "System nr",
color = "System nr",
linetype = "") +
scale_y_continuous(limits = c(0, 6250)) +
scale_x_continuous(limits = c(-2, 30)) +
scale_linetype_discrete(labels = c("medium-medium",
"small-large")) +
theme_bw() +
theme(panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
legend.position = c(.95, .95),
legend.justification = c("right", "top"),
legend.box.just = "right",
legend.margin = margin(6, 6, 6, 6)) +
geom_vline(xintercept = first_perturbation_day,
linetype="dotdash",
color = "grey",
size=0.7) +
labs(caption = "Vertical grey line: first perturbation")
p_regional_low_mean = ds_regional %>%
filter(disturbance == "low") %>%
filter (metaecosystem_type == "S_L" | metaecosystem_type == "M_M") %>%
ggplot (aes(x = day,
y = regional_mean_bioarea,
group = interaction(day, metaecosystem_type),
fill = metaecosystem_type)) +
geom_boxplot() +
labs(x = "Day",
y = "Regional bioarea (something/µl)",
title = "Disturbance = low",
color='',
fill='') +
scale_fill_discrete(labels = c("medium-medium",
"small-large")) +
theme_bw() +
theme(panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
legend.position = c(.95, .95),
legend.justification = c("right", "top"),
legend.box.just = "right",
legend.margin = margin(6, 6, 6, 6)) +
geom_vline(xintercept = first_perturbation_day + 0.7,
linetype="dotdash",
color = "grey",
size=0.7) +
labs(caption = "Vertical grey line: first perturbation")
p_regional_low_mean
ds_regional %>%
filter(disturbance == "high") %>%
filter (metaecosystem_type == "S_L" | metaecosystem_type == "M_M") %>%
ggplot (aes(x = day,
y = regional_mean_bioarea,
group = interaction (day, metaecosystem_type),
fill = metaecosystem_type)) +
geom_boxplot() +
labs(x = "Day",
y = "Regional bioarea (something/µl)",
title = "Disturbance = high",
color='',
fill='') +
scale_fill_discrete(labels = c("medium-medium", "small-large")) +
theme_bw() +
theme(panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
legend.position = c(.95, .95),
legend.justification = c("right", "top"),
legend.box.just = "right",
legend.margin = margin(6, 6, 6, 6)) +
geom_vline(xintercept = first_perturbation_day + 0.7,
linetype="dotdash",
color = "grey",
size=0.7) +
labs(caption = "Vertical grey line: first perturbation")
We can see that the regional biomass was higher for meta-ecosystems with the same size, regardless of disturbance. As both disturbance levels showed the same pattern, I would just keep the one with low disturbance in the publication, as it shows a stronger pattern.
Here we want to see whether we can include time as a fixed effect by transforming the bioarea into its logarithm with base 10. As we are not considering the first two time points because they were before the first disturbance (we want to see the effects of meta-ecosystem type under disturbance), we are lucky that there is a better chance that the biomass will look like goes down in a linear fashion.
Linearity of regional bioarea ~ time
ds_regional %>%
filter(time_point >= 2) %>%
ggplot(aes(x = day,
y = regional_mean_bioarea,
group = day)) +
geom_boxplot() +
labs(title = "Without log transformation",
x = "Day",
y = "Regional bioarea (something/µl)")
Let’s check how linear the relationship is.
linear_model = lm(regional_mean_bioarea ~
day,
data = ds_regional %>%
filter(time_point >= 2) %>%
filter(metaecosystem_type == "M_M" | metaecosystem_type == "S_L"))
par(mfrow=c(2,3))
plot(linear_model, which = 1:5)
I’m not 100% convinced, as the residuals vs fitted has a bit of a banana shape line.
Linearity of Log10(Regional bioarea +1) ~ time
ds_regional %>%
filter(time_point >= 2) %>%
ggplot(aes(x = day,
y = log(regional_mean_bioarea + 1),
group = day)) +
geom_boxplot() +
labs(title = "With log transformation",
x = "Day",
y = "Log (regional bioarea + 1) (something/µl)")
Let’s now check how linear these two relationships are.
log_linear_model = lm(log10(regional_mean_bioarea + 1) ~
day,
data = ds_regional %>%
filter(time_point >= 2) %>%
filter(metaecosystem_type == "M_M" | metaecosystem_type == "S_L"))
par(mfrow=c(2,3))
plot(log_linear_model, which = 1:5)
par(mfrow=c(1,1))
Way better. Especially the residuals vs fitted values plot doesn’t have a banana shape anymore. This seems to be a good model. Let’s then keep the log transformed bioarea.
Model selection
Let’ start from the full model.
\[ Log_{10} (Regional \: bioarea + 1) = t + M + D + tM + tD + MD + tDM + (t | system \: nr) \]
full = lmer(log10(regional_mean_bioarea + 1) ~
day * metaecosystem_type * disturbance +
(day | system_nr),
data = ds_regional %>%
filter(time_point >= 2) %>%
filter(metaecosystem_type == "M_M" | metaecosystem_type == "S_L"),
REML = FALSE,
control = lmerControl(optimizer = "Nelder_Mead"))
Should we keep the correlation in (day | system_nr)?
no_correlation = lmer(log10(regional_mean_bioarea + 1) ~
day * metaecosystem_type * disturbance +
(day | system_nr),
data = ds_regional %>%
filter(time_point >= 2) %>%
filter(metaecosystem_type == "M_M" | metaecosystem_type == "S_L"),
REML = FALSE,
control = lmerControl(optimizer = "Nelder_Mead"))
anova(full, no_correlation)
## Data: ds_regional %>% filter(time_point >= 2) %>% filter(metaecosystem_type == ...
## Models:
## full: log10(regional_mean_bioarea + 1) ~ day * metaecosystem_type * disturbance + (day | system_nr)
## no_correlation: log10(regional_mean_bioarea + 1) ~ day * metaecosystem_type * disturbance + (day | system_nr)
## npar AIC BIC logLik deviance Chisq Df Pr(>Chisq)
## full 12 -168.78 -135.33 96.389 -192.78
## no_correlation 12 -168.78 -135.33 96.389 -192.78 0 0
Yes.
Should we keep t * M * D?
no_threeway = lmer(log10(regional_mean_bioarea + 1) ~
day +
metaecosystem_type +
disturbance +
day : metaecosystem_type +
day : disturbance +
metaecosystem_type : disturbance +
(day | system_nr),
data = ds_regional %>%
filter(time_point >= 2) %>%
filter(metaecosystem_type == "M_M" | metaecosystem_type == "S_L"),
REML = FALSE,
control = lmerControl(optimizer = 'optimx',
optCtrl = list(method = 'L-BFGS-B')))
anova(full, no_threeway)
## Data: ds_regional %>% filter(time_point >= 2) %>% filter(metaecosystem_type == ...
## Models:
## no_threeway: log10(regional_mean_bioarea + 1) ~ day + metaecosystem_type + disturbance + day:metaecosystem_type + day:disturbance + metaecosystem_type:disturbance + (day | system_nr)
## full: log10(regional_mean_bioarea + 1) ~ day * metaecosystem_type * disturbance + (day | system_nr)
## npar AIC BIC logLik deviance Chisq Df Pr(>Chisq)
## no_threeway 11 -170.66 -140.00 96.330 -192.66
## full 12 -168.78 -135.33 96.389 -192.78 0.1182 1 0.7309
No.
Should we keep t * M?
no_TM = lmer(log10(regional_mean_bioarea + 1) ~
day +
metaecosystem_type +
disturbance +
day : disturbance +
metaecosystem_type : disturbance +
(day | system_nr),
data = ds_regional %>%
filter(time_point >= 2) %>%
filter(metaecosystem_type == "M_M" | metaecosystem_type == "S_L"),
REML = FALSE,
control = lmerControl(optimizer = "Nelder_Mead"))
anova(no_threeway,no_TM)
## Data: ds_regional %>% filter(time_point >= 2) %>% filter(metaecosystem_type == ...
## Models:
## no_TM: log10(regional_mean_bioarea + 1) ~ day + metaecosystem_type + disturbance + day:disturbance + metaecosystem_type:disturbance + (day | system_nr)
## no_threeway: log10(regional_mean_bioarea + 1) ~ day + metaecosystem_type + disturbance + day:metaecosystem_type + day:disturbance + metaecosystem_type:disturbance + (day | system_nr)
## npar AIC BIC logLik deviance Chisq Df Pr(>Chisq)
## no_TM 10 -172.62 -144.74 96.308 -192.62
## no_threeway 11 -170.66 -140.00 96.330 -192.66 0.0431 1 0.8356
No.
Should we keep t * D?
no_TD = lmer(log10(regional_mean_bioarea + 1) ~
day +
metaecosystem_type +
disturbance +
metaecosystem_type : disturbance +
(day | system_nr),
data = ds_regional %>%
filter(time_point >= 2) %>%
filter(metaecosystem_type == "M_M" | metaecosystem_type == "S_L"),
REML = FALSE,
control = lmerControl(optimizer = "Nelder_Mead"))
anova(no_TM, no_TD)
## Data: ds_regional %>% filter(time_point >= 2) %>% filter(metaecosystem_type == ...
## Models:
## no_TD: log10(regional_mean_bioarea + 1) ~ day + metaecosystem_type + disturbance + metaecosystem_type:disturbance + (day | system_nr)
## no_TM: log10(regional_mean_bioarea + 1) ~ day + metaecosystem_type + disturbance + day:disturbance + metaecosystem_type:disturbance + (day | system_nr)
## npar AIC BIC logLik deviance Chisq Df Pr(>Chisq)
## no_TD 9 -163.54 -138.46 90.771 -181.54
## no_TM 10 -172.62 -144.74 96.308 -192.62 11.074 1 0.0008756 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
No.
Should we keep M * D?
no_MD = lmer(log10(regional_mean_bioarea + 1) ~
day +
metaecosystem_type +
disturbance +
day : disturbance +
(day | system_nr),
data = ds_regional %>%
filter(time_point >= 2) %>%
filter(metaecosystem_type == "M_M" | metaecosystem_type == "S_L"),
REML = FALSE,
control = lmerControl(optimizer = "Nelder_Mead"))
anova(no_TM, no_MD)
## Data: ds_regional %>% filter(time_point >= 2) %>% filter(metaecosystem_type == ...
## Models:
## no_MD: log10(regional_mean_bioarea + 1) ~ day + metaecosystem_type + disturbance + day:disturbance + (day | system_nr)
## no_TM: log10(regional_mean_bioarea + 1) ~ day + metaecosystem_type + disturbance + day:disturbance + metaecosystem_type:disturbance + (day | system_nr)
## npar AIC BIC logLik deviance Chisq Df Pr(>Chisq)
## no_MD 9 -173.87 -148.78 95.932 -191.87
## no_TM 10 -172.62 -144.74 96.308 -192.62 0.7513 1 0.3861
No.
Should we keep the random effect of system nr on the time slopes (day | system_nr)?
no_random_slopes = lmer(log10(regional_mean_bioarea + 1) ~
day +
metaecosystem_type +
disturbance +
day : disturbance +
(1 | system_nr),
data = ds_regional %>%
filter(time_point >= 2) %>%
filter(metaecosystem_type == "M_M" | metaecosystem_type == "S_L"),
REML = FALSE,
control = lmerControl(optimizer = "Nelder_Mead"))
anova(no_MD, no_random_slopes)
## Data: ds_regional %>% filter(time_point >= 2) %>% filter(metaecosystem_type == ...
## Models:
## no_random_slopes: log10(regional_mean_bioarea + 1) ~ day + metaecosystem_type + disturbance + day:disturbance + (1 | system_nr)
## no_MD: log10(regional_mean_bioarea + 1) ~ day + metaecosystem_type + disturbance + day:disturbance + (day | system_nr)
## npar AIC BIC logLik deviance Chisq Df Pr(>Chisq)
## no_random_slopes 7 -173.23 -153.71 93.613 -187.23
## no_MD 9 -173.87 -148.78 95.932 -191.87 4.6394 2 0.0983 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Yes.
Best model
Therefore, our best model is:
\[ log_{10}(regional \: bioarea + 1) = t + M + D + t*D + (t| system \: nr) \]
This model is the best model when looking at all time points coming after the first disturbance event (t2->t7). Assuming that this model holds for also other sections of the time series, the r squared of the model and of meta-ecosystem type is as follows.
#Create a table in which the regional biomass has been log transformed.
### --- INITIALISE TABLE --- ###
columns = c("model", "time_point", "AIC", "BIC", "R2_mixed", "R2_fixed", "R2_mixed_M", "R2_fixed_M")
log_time_table = data.frame(matrix(ncol = length(columns), nrow = 0))
colnames(log_time_table) = columns
### --- POPULATE THE TABLE --- ###
for (last_point in 4:7) {
full_model = lmer(log10(regional_mean_bioarea + 1) ~
day +
metaecosystem_type +
disturbance +
day : disturbance +
(day | system_nr),
data = ds_regional %>%
filter(time_point >= 2) %>%
filter(time_point <= last_point) %>%
filter(metaecosystem_type == "M_M" |
metaecosystem_type == "S_L"),
REML = FALSE,
control = lmerControl(optimizer = "Nelder_Mead"))
null_model = lm(regional_mean_bioarea ~
1 ,
data = ds_regional %>%
filter(time_point >= 2) %>%
filter(time_point <= last_point) %>%
filter(metaecosystem_type == "M_M" |
metaecosystem_type == "S_L"))
metaeco_null_model = lmer(log10(regional_mean_bioarea + 1) ~
day +
disturbance +
day : disturbance +
(day | system_nr),
data = ds_regional %>%
filter(time_point >= 2) %>%
filter(time_point <= last_point) %>%
filter(metaecosystem_type == "M_M" |
metaecosystem_type == "S_L"),
REML = FALSE,
control = lmerControl(optimizer = "Nelder_Mead"))
log_time_table = update_all_models_table("t + M + D + t * M * D + (t || system_nr)",
log_time_table,
full_model,
null_model,
metaeco_null_model,
"mixed")
}
## Warning: 'r.squaredGLMM' now calculates a revised statistic. See the help page.
datatable(log_time_table,
rownames = FALSE,
options = list(pageLength = 100,
scrollX = TRUE,
autoWidth = TRUE,
columnDefs = list(list(targets=c(0),visible=TRUE, width='160'),
list(targets=c(1), visible=TRUE, width='10'),
list(targets=c(2), visible=TRUE, width='10'),
list(targets=c(3), visible=TRUE, width='10'),
list(targets=c(4), visible=TRUE, width='10'),
list(targets=c(5), visible=TRUE, width='10'),
list(targets=c(6), visible=TRUE, width='10'),
list(targets=c(7), visible=TRUE, width='10'),
list(targets='_all', visible=FALSE))),
caption = "
M = Meta-ecosystem type,
D = disturbance,
(1 | t) = random effect of time on the intercept,
(1 | ID) = random effect of meta-ecosystem ID on the intercept,
|| = no correlation between intercept and slope,
| = correlation between intercept and slope,
R2 = r squared of the whole model,
R2_fixed = fixed part of the mixed model,
mixed_R2 = r squared when considering both fixed and random effects (conditional r squared),
fixed_R2 = r squared when considering only the fixed effects (marginal r squared)")
Notice that I did not include the t3-t4 time series, as when running the model it gives me the following error:
Model selection
I’ll do model selection only on time point number 3 (however, I have done it also with time point 4,5,6,7 and they all give me the same result). Let’s start from the full model.
I can’t construct it from a mixed model because the following error pops up:
full = lm(regional_mean_bioarea ~
metaecosystem_type +
disturbance +
disturbance : metaecosystem_type,
data = ds_regional %>%
filter(time_point == 3) %>%
filter(metaecosystem_type == "M_M" |
metaecosystem_type == "S_L"))
Should we keep D * M?
no_MD = lm(regional_mean_bioarea ~
metaecosystem_type +
disturbance,
data = ds_regional %>%
filter(time_point == 3) %>%
filter(metaecosystem_type == "M_M" |
metaecosystem_type == "S_L"))
anova(full, no_MD)
## Analysis of Variance Table
##
## Model 1: regional_mean_bioarea ~ metaecosystem_type + disturbance + disturbance:metaecosystem_type
## Model 2: regional_mean_bioarea ~ metaecosystem_type + disturbance
## Res.Df RSS Df Sum of Sq F Pr(>F)
## 1 16 3433038
## 2 17 3732239 -1 -299201 1.3945 0.2549
AIC(full, no_MD)
## df AIC
## full 5 307.8220
## no_MD 4 307.4933
No.
Best model
Therefore, our best model is
\[ Regional \: Bioarea = M + D \]
Here I test whether the bioarea of small-large we see is due to meta-ecosystem dynamics or simply to their area.
To make the data-set, I’m going through the following steps:
isolated_S = ds_biomass %>%
filter(eco_metaeco_type == "S") %>%
group_by(system_nr, disturbance, time_point, day, patch_size) %>%
summarise(bioarea_per_volume_across_videos = mean(bioarea_per_volume))
isolated_L = ds_biomass %>%
filter(eco_metaeco_type == "L") %>%
group_by(system_nr, disturbance, time_point, day, patch_size) %>%
summarise(bioarea_per_volume_across_videos = mean(bioarea_per_volume))
### Check that they have the same number of patches, then check which one is the oen missing
length(unique(isolated_S$system_nr))
length(unique(isolated_L$system_nr))
unique(isolated_S$system_nr)
unique(isolated_L$system_nr)
#Take off one of the high disturbance small patches (system nr 50) because isolated small misses one at high disturbance
isolated_L = isolated_L%>%
filter(!system_nr == 50)
n_isolated_patches = length(unique(isolated_L$system_nr))
n_time_points = 8
number_for_pairing = rep( c( 1:n_isolated_patches), each = n_time_points)
number_for_pairing = as.data.frame(number_for_pairing)
colnames(number_for_pairing) = "number_for_pairing"
isolated_S = cbind(isolated_S, number_for_pairing)
isolated_L = cbind(isolated_L, number_for_pairing)
isolated_S_and_L = rbind(isolated_S, isolated_L)
SL_from_isolated = isolated_S_and_L %>%
group_by(disturbance, time_point, day, number_for_pairing) %>%
summarise(regional_mean_bioarea = mean(bioarea_per_volume_across_videos)) %>%
mutate(metaecosystem_type = "S_L_from_isolated") %>%
mutate(system_nr = number_for_pairing)
ds_regional_with_SL_from_isolated = rbind(SL_from_isolated, ds_regional)
ds_regional_with_SL_from_isolated %>%
filter ( disturbance == "low") %>%
filter (metaecosystem_type == "S_L" | metaecosystem_type == "S_L_from_isolated") %>%
ggplot (aes(x = day,
y = regional_mean_bioarea,
group = system_nr,
fill = system_nr,
color = system_nr,
linetype = metaecosystem_type)) +
geom_line () +
labs(x = "Day",
y = "Regional bioarea (something/µl)",
title = "Disturbance = low",
fill = "System nr",
color = "System nr",
linetype = "") +
scale_y_continuous(limits = c(0, 6250)) +
scale_x_continuous(limits = c(-2, 30)) +
scale_linetype_discrete(labels = c("small-large",
"small-large \n from isolated")) +
theme_bw() +
theme(panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
legend.position = c(.95, .95),
legend.justification = c("right", "top"),
legend.box.just = "right",
legend.margin = margin(6, 6, 6, 6)) +
geom_vline(xintercept = first_perturbation_day,
linetype="dotdash",
color = "grey",
size=0.7) +
labs(caption = "Vertical grey line: first perturbation")
ds_regional_with_SL_from_isolated %>%
filter ( disturbance == "high") %>%
filter (metaecosystem_type == "S_L" | metaecosystem_type == "S_L_from_isolated") %>%
ggplot (aes(x = day,
y = regional_mean_bioarea,
group = system_nr,
fill = system_nr,
color = system_nr,
linetype = metaecosystem_type)) +
geom_line () +
labs(title = "Disturbance = high",
x = "Day",
y = "Regional bioarea (something/µl)",
fill = "System nr",
color = "System nr",
linetype = "") +
scale_y_continuous(limits = c(0, 6250)) +
scale_x_continuous(limits = c(-2, 30)) +
scale_linetype_discrete(labels = c("small-large",
"small-large \n from isolated")) +
theme_bw() +
theme(panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
legend.position = c(.95, .95),
legend.justification = c("right", "top"),
legend.box.just = "right",
legend.margin = margin(6, 6, 6, 6)) +
geom_vline(xintercept = first_perturbation_day,
linetype="dotdash",
color = "grey",
size=0.7) +
labs(caption = "Vertical grey line: first perturbation")
ds_regional_with_SL_from_isolated %>%
filter(disturbance == "low") %>%
filter(metaecosystem_type == "S_L" | metaecosystem_type == "S_L_from_isolated") %>%
ggplot(aes(x = day,
y = regional_mean_bioarea,
group = interaction(day, metaecosystem_type),
fill = metaecosystem_type)) +
geom_boxplot() +
labs(title = "Disturbance = low",
x = "Day",
y = "Regional bioarea (something/microlitre)",
fill = "") +
scale_fill_discrete(labels = c("small-large", "isolated small & \n isolated large")) +
theme_bw() +
theme(panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
legend.position = c(.95, .95),
legend.justification = c("right", "top"),
legend.box.just = "right",
legend.margin = margin(6, 6, 6, 6)) +
geom_vline(xintercept = first_perturbation_day + 0.7,
linetype="dotdash",
color = "grey",
size=0.7) +
labs(caption = "Vertical grey line: first perturbation")
ds_regional_with_SL_from_isolated %>%
filter(disturbance == "high") %>%
filter(metaecosystem_type == "S_L" | metaecosystem_type == "S_L_from_isolated") %>%
ggplot(aes(x = day,
y = regional_mean_bioarea,
group = interaction(day, metaecosystem_type),
fill = metaecosystem_type)) +
geom_boxplot() +
labs(title = "Disturbance = high",
x = "Day",
y = "Regional bioarea (something/microlitre)",
fill = "") +
scale_fill_discrete(labels = c("small-large", "isolated small & \n isolated large")) +
theme_bw() +
theme(panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
legend.position = c(.95, .95),
legend.justification = c("right", "top"),
legend.box.just = "right",
legend.margin = margin(6, 6, 6, 6)) +
geom_vline(xintercept = first_perturbation_day + 0.7,
linetype="dotdash",
color = "grey",
size=0.7) +
labs(caption = "Vertical grey line: first perturbation")
mixed_model = lmer(log10(regional_mean_bioarea +1) ~
day * metaecosystem_type * disturbance +
(day | system_nr),
data = ds_regional_with_SL_from_isolated %>%
filter(metaecosystem_type == "S_L" |
metaecosystem_type == "S_L_from_isolated") %>%
filter(time_point >= 2),
REML = FALSE,
control = lmerControl (optimizer = "Nelder_Mead"))
null_model = lmer(log10(regional_mean_bioarea +1) ~
day * disturbance +
(day | system_nr),
data = ds_regional_with_SL_from_isolated %>%
filter(metaecosystem_type == "S_L" |
metaecosystem_type == "S_L_from_isolated") %>%
filter(time_point >= 2),
REML = FALSE,
control = lmerControl (optimizer = "Nelder_Mead"))
anova(mixed_model, null_model)
## Data: ds_regional_with_SL_from_isolated %>% filter(metaecosystem_type == ...
## Models:
## null_model: log10(regional_mean_bioarea + 1) ~ day * disturbance + (day | system_nr)
## mixed_model: log10(regional_mean_bioarea + 1) ~ day * metaecosystem_type * disturbance + (day | system_nr)
## npar AIC BIC logLik deviance Chisq Df Pr(>Chisq)
## null_model 8 -154.45 -132.56 85.224 -170.45
## mixed_model 12 -147.96 -115.12 85.980 -171.96 1.5111 4 0.8247
Let’s now create a regional bioarea data-set in which all possible combinations of small-large coupled patches from isolated are created.
system_nr_S_low = unique(isolated_S$system_nr)[1:5]
system_nr_L_low = unique(isolated_L$system_nr)[1:5]
system_nr_S_high = unique(isolated_S$system_nr)[6:9]
system_nr_L_high = unique(isolated_L$system_nr)[6:9]
low_pairs = expand.grid(system_nr_S_low,system_nr_L_low)
high_pairs = expand.grid(system_nr_S_high, system_nr_L_high)
pairs = rbind(low_pairs, high_pairs)
number_of_pairs = nrow(pairs)
SL_from_isolated_all_combinations = NULL
for (pair in 1:number_of_pairs){
SL_from_isolated_one_combination = ds_biomass %>%
filter(system_nr %in% pairs[pair,]) %>%
group_by(disturbance, day, time_point, system_nr) %>%
summarise(regional_bioarea_across_videos = mean(bioarea_per_volume)) %>%
group_by(disturbance, day, time_point) %>%
summarise(regional_mean_bioarea = mean(regional_bioarea_across_videos)) %>%
mutate(system_nr = 1000 + pair) %>%
mutate(metaecosystem_type = "S_L_from_isolated")
SL_from_isolated_all_combinations = rbind(SL_from_isolated_one_combination,
SL_from_isolated_all_combinations)
}
ds_regional_with_SL_from_isolated = rbind(SL_from_isolated_all_combinations, ds_regional)
Next step: I should try to bootstrap 5 m
ds_regional %>%
filter(!metaecosystem_type == "S_L") %>%
filter ( disturbance == "low") %>%
ggplot (aes(x = day,
y = regional_mean_bioarea,
group = system_nr,
fill = system_nr,
color = system_nr,
linetype = metaecosystem_type)) +
geom_line () +
labs(x = "Day",
y = "Regional bioarea (something/µl)",
title = "Disturbance = low",
fill = "System nr",
linetype = "") +
scale_y_continuous(limits = c(0, 6250)) +
scale_x_continuous(limits = c(-2, 30)) +
scale_colour_continuous(guide = "none") +
theme_bw() +
theme(panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
legend.position = c(.95, .95),
legend.justification = c("right", "top"),
legend.box.just = "right",
legend.margin = margin(6, 6, 6, 6)) +
scale_linetype_discrete(labels = c("large-large",
"medium-medium",
"small-small")) +
geom_vline(xintercept = first_perturbation_day,
linetype="dotdash",
color = "grey",
size=0.7) +
labs(caption = "Vertical grey line: first perturbation")
ds_regional %>%
filter(!metaecosystem_type == "S_L") %>%
filter ( disturbance == "high") %>%
ggplot (aes(x = day,
y = regional_mean_bioarea,
group = system_nr,
fill = system_nr,
color = system_nr,
linetype = metaecosystem_type)) +
geom_line () +
labs(x = "Day",
y = "Regional bioarea (something/µl)",
title = "Disturbance = high",
fill = "System nr",
linetype = "") +
scale_y_continuous(limits = c(0, 6250)) +
scale_x_continuous(limits = c(-2, 30)) +
scale_colour_continuous(guide = "none") +
theme_bw() +
theme(panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
legend.position = c(.95, .95),
legend.justification = c("right", "top"),
legend.box.just = "right",
legend.margin = margin(6, 6, 6, 6)) +
scale_linetype_discrete(labels = c("large-large",
"medium-medium",
"small-small")) +
geom_vline(xintercept = first_perturbation_day,
linetype="dotdash",
color = "grey",
size=0.7) +
labs(caption = "Vertical grey line: first perturbation")
ds_regional %>%
filter(disturbance == "low") %>%
filter(!metaecosystem_type == "S_L") %>%
ggplot(aes(x = day,
y = regional_mean_bioarea,
group = interaction(day, metaecosystem_type),
fill = metaecosystem_type)) +
geom_boxplot() +
labs(title = "Disturbance = low",
x = "Day",
y = "Local bioarea (something/μl)",
fill = "") +
#scale_fill_discrete(labels = c("isolated large", "isolated medium", "isolated small")) +
theme_bw() +
theme(panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
legend.position = c(.95, .95),
legend.justification = c("right", "top"),
legend.box.just = "right",
legend.margin = margin(6, 6, 6, 6)) +
scale_fill_discrete(labels = c("large-large",
"medium-medium",
"small-small")) +
geom_vline(xintercept = first_perturbation_day + 0.7,
linetype="dotdash",
color = "grey",
size=0.7) +
labs(caption = "Vertical grey line: first perturbation")
ds_regional %>%
filter(disturbance == "high") %>%
filter(!metaecosystem_type == "S_L") %>%
ggplot(aes(x = day,
y = regional_mean_bioarea,
group = interaction(day, metaecosystem_type),
fill = metaecosystem_type)) +
geom_boxplot() +
labs(title = "Disturbance = high",
x = "Day",
y = "Local bioarea (something/μl)",
fill = "") +
#scale_fill_discrete(labels = c("isolated large", "isolated medium", "isolated small")) +
theme_bw() +
theme(panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
legend.position = c(.95, .95),
legend.justification = c("right", "top"),
legend.box.just = "right",
legend.margin = margin(6, 6, 6, 6)) +
scale_fill_discrete(labels = c("large-large",
"medium-medium",
"small-small")) +
geom_vline(xintercept = first_perturbation_day + 0.7,
linetype="dotdash",
color = "grey",
size=0.7) +
labs(caption = "Vertical grey line: first perturbation")
Interesting. It seems like there’s not much difference between the medium-medium and the large-large.
Now I’ll:
average_biomass_S_isolated_low = ds_biomass %>%
filter(disturbance == "low") %>%
filter(eco_metaeco_type == "S") %>%
group_by(culture_ID, time_point, day) %>%
summarise(bioarea_per_volume_across_videos = mean(bioarea_per_volume)) %>% #Across videos
group_by(time_point, day) %>%
summarise(mean_bioarea_per_volume = mean(bioarea_per_volume_across_videos)) #Across cultures
average_biomass_S_isolated_high = ds_biomass %>%
filter(disturbance == "high") %>%
filter(eco_metaeco_type == "S") %>%
group_by(culture_ID, time_point, day) %>%
summarise(bioarea_per_volume_across_videos = mean(bioarea_per_volume)) %>%
group_by(time_point, day) %>%
summarise(mean_bioarea_per_volume = mean(bioarea_per_volume_across_videos))
for (time_point_input in 0:7) {
ds_biomass$isolated_control[ds_biomass$patch_size == "S" &
ds_biomass$disturbance == "low" &
ds_biomass$time_point == time_point_input] =
subset(average_biomass_S_isolated_low,
time_point == time_point_input)$mean_bioarea_per_volume
}
for (time_point_input in 0:7) {
ds_biomass$isolated_control[ds_biomass$patch_size == "S" &
ds_biomass$disturbance == "high" &
ds_biomass$time_point == time_point_input] =
subset(average_biomass_S_isolated_high,
time_point == time_point_input)$mean_bioarea_per_volume
}
ds_biomass = ds_biomass %>%
mutate(isolated_control = as.numeric(isolated_control)) %>%
mutate(RR_bioarea_per_volume = bioarea_per_volume / isolated_control)
### --- SINGLE ECOSYSTEMS --- ###
ds_biomass %>%
filter(disturbance == "low") %>%
#filter(eco_metaeco_type == "S (S_S)" | eco_metaeco_type == "S (S_L)") %>%
filter(patch_size == "S") %>%
ggplot(aes(x = day,
y = bioarea_per_volume,
group = system_nr,
fill = system_nr,
color = system_nr,
linetype = eco_metaeco_type)) +
geom_line(stat = "summary", fun = "mean") +
labs(x = "Day",
y = "Local bioarea (something/μl)",
title = "Disturbance = low",
linetype = "") +
theme_bw() +
theme(panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
legend.position = c(.95, .95),
legend.justification = c("right", "top"),
legend.box.just = "right",
legend.margin = margin(6, 6, 6, 6)) +
scale_linetype_discrete(labels = c("small isolated",
"small connected to small",
"small connected to large")) +
geom_vline(xintercept = first_perturbation_day,
linetype="dotdash",
color = "grey",
size=0.7) +
labs(caption = "Vertical grey line: first perturbation")
ds_biomass %>%
filter(disturbance == "high") %>%
#filter(eco_metaeco_type == "S (S_S)" | eco_metaeco_type == "S (S_L)") %>%
filter(patch_size == "S") %>%
ggplot(aes(x = day,
y = bioarea_per_volume,
group = system_nr,
fill = system_nr,
color = system_nr,
linetype = eco_metaeco_type)) +
geom_line(stat = "summary", fun = "mean") +
labs(title = "Disturbance = high",
x = "Day",
y = "Local bioarea (something/μl)",
linetype = "") +
theme_bw() +
theme(panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
legend.position = c(.95, .95),
legend.justification = c("right", "top"),
legend.box.just = "right",
legend.margin = margin(6, 6, 6, 6)) +
scale_linetype_discrete(labels = c("small isolated",
"small connected to small",
"small connected to large")) +
geom_vline(xintercept = first_perturbation_day,
linetype="dotdash",
color = "grey",
size=0.7) +
labs(caption = "Vertical grey line: first perturbation")
### --- BOXPLOTS --- ###
local_small_low_plot = ds_biomass %>%
filter(disturbance == "low") %>%
filter(patch_size == "S") %>%
ggplot(aes(x = day,
y = bioarea_per_volume,
group = interaction(day,eco_metaeco_type),
fill = eco_metaeco_type)) +
geom_boxplot() +
labs(x = "Day",
y = "Local bioarea (something/μl)",
title = "Disturbance = low",
fill = "") +
theme_bw() +
theme(panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
legend.position = c(.95, .95),
legend.justification = c("right", "top"),
legend.box.just = "right",
legend.margin = margin(6, 6, 6, 6)) +
scale_fill_discrete(labels = c("small isolated",
"small connected to small",
"small connected to large")) +
geom_vline(xintercept = first_perturbation_day + 0.7,
linetype="dotdash",
color = "grey",
size=0.7) +
labs(caption = "Vertical grey line: first perturbation")
local_small_low_plot
ds_biomass %>%
filter(disturbance == "high") %>%
filter(patch_size == "S") %>%
ggplot(aes(x = day,
y = bioarea_per_volume,
group = interaction(day,eco_metaeco_type),
fill = eco_metaeco_type)) +
geom_boxplot() +
labs(title = "Disturbance = high",
x = "Day",
y = "Local bioarea (something/μl)",
fill = "") +
theme_bw() +
theme(panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
legend.position = c(.95, .95),
legend.justification = c("right", "top"),
legend.box.just = "right",
legend.margin = margin(6, 6, 6, 6)) +
scale_fill_discrete(labels = c("small isolated",
"small connected to small",
"small connected to large")) +
geom_vline(xintercept = first_perturbation_day + 0.7,
linetype="dotdash",
color = "grey",
size=0.7) +
labs(caption = "Vertical grey line: first perturbation")
### --- REPLOT THE SMALL PATCHES BUT AS RESPONSE RATIO --- ###
ds_biomass %>%
filter(disturbance == "low") %>%
filter(patch_size == "S") %>%
group_by(system_nr, day, eco_metaeco_type) %>%
summarise(RR_bioarea_per_volume = mean(RR_bioarea_per_volume)) %>% #Across videos
ggplot(aes(x = day,
y = RR_bioarea_per_volume,
group = interaction(day, eco_metaeco_type),
fill = eco_metaeco_type)) +
geom_boxplot() +
labs(title = "Disturbance = low",
x = "Day",
y = "Response ratio bioarea (something/μl)",
fill = "") +
theme_bw() +
theme(panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
legend.position = c(.3, .95),
legend.justification = c("right", "top"),
legend.box.just = "right",
legend.margin = margin(6, 6, 6, 6)) +
scale_fill_discrete(labels = c("small isolated",
"small connected to small",
"small connected to large")) +
geom_vline(xintercept = first_perturbation_day + 0.7,
linetype="dotdash",
color = "grey",
size=0.7) +
labs(caption = "Vertical grey line: first perturbation, Response ratio bioarea: Bioarea/Mean bioarea small isolated")
ds_biomass %>%
filter(disturbance == "high") %>%
filter(patch_size == "S") %>%
group_by(system_nr, day, eco_metaeco_type) %>%
summarise(RR_bioarea_per_volume = mean(RR_bioarea_per_volume)) %>% #Across videos
ggplot(aes(x = day,
y = RR_bioarea_per_volume,
group = interaction(day, eco_metaeco_type),
fill = eco_metaeco_type)) +
geom_boxplot() +
labs(title = "Disturbance = high",
x = "Day",
y = "Response ratio bioarea (something/μl)",
fill = "") +
theme_bw() +
theme(panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
legend.position = c(.3, .95),
legend.justification = c("right", "top"),
legend.box.just = "right",
legend.margin = margin(6, 6, 6, 6)) +
scale_fill_discrete(labels = c("small isolated",
"small connected to small",
"small connected to large")) +
geom_vline(xintercept = first_perturbation_day + 0.7,
linetype="dotdash",
color = "grey",
size=0.7) +
labs(caption = "Vertical grey line: first perturbation, Response ratio bioarea: Bioarea/Mean bioarea small isolated")
small_patches_figure = ds_biomass %>%
filter(disturbance == "low") %>%
filter(eco_metaeco_type == "S (S_S)" | eco_metaeco_type == "S (S_L)") %>%
group_by(system_nr, day, eco_metaeco_type) %>%
summarise(RR_bioarea_per_volume = mean(RR_bioarea_per_volume)) %>% #Across videos
ggplot(aes(x = day,
y = RR_bioarea_per_volume,
group = interaction(day, eco_metaeco_type),
fill = eco_metaeco_type)) +
geom_boxplot() +
labs(x = "Day",
y = "Response ratio bioarea (something/μl)",
fill = "") +
theme_bw() +
theme(panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
legend.position = c(.3, .95),
legend.justification = c("right", "top"),
legend.box.just = "right",
legend.margin = margin(6, 6, 6, 6)) +
scale_fill_discrete(labels = c("small connected to small",
"small connected to large")) +
geom_vline(xintercept = first_perturbation_day + 0.7,
linetype="dotdash",
color = "grey",
size=0.7) +
labs(caption = "Vertical grey line: first perturbation, Response ratio bioarea: Bioarea/Mean bioarea small isolated")
Let’s start from the full model:
\[ RR (bioarea) = t + M + D + t*M + t*D + M*D + t*M+D + (t | system \: nr) \]
And let’s consider all data points from the second to the seventh.
first_time_point = 2
last_time_point = 7
full_model = lmer(RR_bioarea_per_volume ~
day * eco_metaeco_type * disturbance +
(day | system_nr),
data = ds_biomass %>%
filter(time_point >= first_time_point) %>%
filter(time_point <= last_time_point) %>%
filter(eco_metaeco_type== "S (S_S)" |
eco_metaeco_type == "S (S_L)"),
REML = FALSE
)
Should we keep the interaction between the intercept and slope in (day | system_nr)?
no_interaction = lmer(RR_bioarea_per_volume ~
day * eco_metaeco_type * disturbance +
(day || system_nr),
data = ds_biomass %>%
filter(time_point >= first_time_point) %>%
filter(time_point <= last_time_point) %>%
filter(eco_metaeco_type== "S (S_S)" |
eco_metaeco_type == "S (S_L)"),
REML = FALSE
)
anova(full_model, no_interaction)
## Data: ds_biomass %>% filter(time_point >= first_time_point) %>% filter(time_point <= ...
## Models:
## no_interaction: RR_bioarea_per_volume ~ day * eco_metaeco_type * disturbance + ((1 | system_nr) + (0 + day | system_nr))
## full_model: RR_bioarea_per_volume ~ day * eco_metaeco_type * disturbance + (day | system_nr)
## npar AIC BIC logLik deviance Chisq Df Pr(>Chisq)
## no_interaction 11 1078.8 1117.1 -528.39 1056.8
## full_model 12 1075.2 1117.0 -525.61 1051.2 5.5665 1 0.01831 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Yes.
Should we keep the effects on the slope?
no_slope = lmer(RR_bioarea_per_volume ~
day * eco_metaeco_type * disturbance +
(1 | system_nr),
data = ds_biomass %>%
filter(time_point >= first_time_point) %>%
filter(time_point <= last_time_point) %>%
filter(eco_metaeco_type== "S (S_S)" |
eco_metaeco_type == "S (S_L)"),
REML = FALSE
)
anova(full_model, no_slope)
## Data: ds_biomass %>% filter(time_point >= first_time_point) %>% filter(time_point <= ...
## Models:
## no_slope: RR_bioarea_per_volume ~ day * eco_metaeco_type * disturbance + (1 | system_nr)
## full_model: RR_bioarea_per_volume ~ day * eco_metaeco_type * disturbance + (day | system_nr)
## npar AIC BIC logLik deviance Chisq Df Pr(>Chisq)
## no_slope 10 1086.3 1121.1 -533.16 1066.3
## full_model 12 1075.2 1117.0 -525.61 1051.2 15.103 2 0.0005253 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Yes.
Should we keep t * M * D?
no_three_way = lmer(RR_bioarea_per_volume ~
day +
eco_metaeco_type +
disturbance +
day : eco_metaeco_type +
day : disturbance +
eco_metaeco_type : disturbance +
(day | system_nr),
data = ds_biomass %>%
filter(time_point >= first_time_point) %>%
filter(time_point <= last_time_point) %>%
filter(eco_metaeco_type== "S (S_S)" |
eco_metaeco_type == "S (S_L)"),
REML = FALSE
)
anova(full_model, no_three_way)
## Data: ds_biomass %>% filter(time_point >= first_time_point) %>% filter(time_point <= ...
## Models:
## no_three_way: RR_bioarea_per_volume ~ day + eco_metaeco_type + disturbance + day:eco_metaeco_type + day:disturbance + eco_metaeco_type:disturbance + (day | system_nr)
## full_model: RR_bioarea_per_volume ~ day * eco_metaeco_type * disturbance + (day | system_nr)
## npar AIC BIC logLik deviance Chisq Df Pr(>Chisq)
## no_three_way 11 1073.4 1111.7 -525.72 1051.4
## full_model 12 1075.2 1117.0 -525.61 1051.2 0.221 1 0.6383
No.
Should we keep t * M?
no_TM = lmer(RR_bioarea_per_volume ~
day +
eco_metaeco_type +
disturbance +
day : disturbance +
eco_metaeco_type : disturbance +
(day | system_nr),
data = ds_biomass %>%
filter(time_point >= first_time_point) %>%
filter(time_point <= last_time_point) %>%
filter(eco_metaeco_type== "S (S_S)" |
eco_metaeco_type == "S (S_L)"),
REML = FALSE
)
anova(no_three_way, no_TM)
## Data: ds_biomass %>% filter(time_point >= first_time_point) %>% filter(time_point <= ...
## Models:
## no_TM: RR_bioarea_per_volume ~ day + eco_metaeco_type + disturbance + day:disturbance + eco_metaeco_type:disturbance + (day | system_nr)
## no_three_way: RR_bioarea_per_volume ~ day + eco_metaeco_type + disturbance + day:eco_metaeco_type + day:disturbance + eco_metaeco_type:disturbance + (day | system_nr)
## npar AIC BIC logLik deviance Chisq Df Pr(>Chisq)
## no_TM 10 1076.8 1111.6 -528.41 1056.8
## no_three_way 11 1073.4 1111.7 -525.72 1051.4 5.3865 1 0.02029 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Yes.
Should we keep t * D?
no_TD = lmer(RR_bioarea_per_volume ~
day +
eco_metaeco_type +
disturbance +
day : eco_metaeco_type +
eco_metaeco_type : disturbance +
(day | system_nr),
data = ds_biomass %>%
filter(time_point >= first_time_point) %>%
filter(time_point <= last_time_point) %>%
filter(eco_metaeco_type== "S (S_S)" |
eco_metaeco_type == "S (S_L)"),
REML = FALSE
)
anova(no_three_way, no_TD)
## Data: ds_biomass %>% filter(time_point >= first_time_point) %>% filter(time_point <= ...
## Models:
## no_TD: RR_bioarea_per_volume ~ day + eco_metaeco_type + disturbance + day:eco_metaeco_type + eco_metaeco_type:disturbance + (day | system_nr)
## no_three_way: RR_bioarea_per_volume ~ day + eco_metaeco_type + disturbance + day:eco_metaeco_type + day:disturbance + eco_metaeco_type:disturbance + (day | system_nr)
## npar AIC BIC logLik deviance Chisq Df Pr(>Chisq)
## no_TD 10 1074.0 1108.8 -526.98 1054.0
## no_three_way 11 1073.4 1111.7 -525.72 1051.4 2.5254 1 0.112
Yes.
Should we keep M * D?
no_MD = lmer(RR_bioarea_per_volume ~
day +
eco_metaeco_type +
disturbance +
day : eco_metaeco_type +
day : disturbance +
(day | system_nr),
data = ds_biomass %>%
filter(time_point >= first_time_point) %>%
filter(time_point <= last_time_point) %>%
filter(eco_metaeco_type== "S (S_S)" |
eco_metaeco_type == "S (S_L)"),
REML = FALSE
)
anova(no_three_way, no_MD)
## Data: ds_biomass %>% filter(time_point >= first_time_point) %>% filter(time_point <= ...
## Models:
## no_MD: RR_bioarea_per_volume ~ day + eco_metaeco_type + disturbance + day:eco_metaeco_type + day:disturbance + (day | system_nr)
## no_three_way: RR_bioarea_per_volume ~ day + eco_metaeco_type + disturbance + day:eco_metaeco_type + day:disturbance + eco_metaeco_type:disturbance + (day | system_nr)
## npar AIC BIC logLik deviance Chisq Df Pr(>Chisq)
## no_MD 10 1071.5 1106.3 -525.74 1051.5
## no_three_way 11 1073.4 1111.7 -525.72 1051.4 0.0487 1 0.8253
No.
Should we keep D?
no_D = lmer(RR_bioarea_per_volume ~
day +
eco_metaeco_type +
day : eco_metaeco_type +
day : disturbance +
(day | system_nr),
data = ds_biomass %>%
filter(time_point >= first_time_point) %>%
filter(time_point <= last_time_point) %>%
filter(eco_metaeco_type== "S (S_S)" |
eco_metaeco_type == "S (S_L)"),
REML = FALSE,
control = lmerControl (optimizer = "Nelder_Mead")
)
anova(no_MD, no_D)
## Data: ds_biomass %>% filter(time_point >= first_time_point) %>% filter(time_point <= ...
## Models:
## no_D: RR_bioarea_per_volume ~ day + eco_metaeco_type + day:eco_metaeco_type + day:disturbance + (day | system_nr)
## no_MD: RR_bioarea_per_volume ~ day + eco_metaeco_type + disturbance + day:eco_metaeco_type + day:disturbance + (day | system_nr)
## npar AIC BIC logLik deviance Chisq Df Pr(>Chisq)
## no_D 9 1069.6 1100.9 -525.79 1051.6
## no_MD 10 1071.5 1106.3 -525.74 1051.5 0.0897 1 0.7645
Wired, AIC tells us that we should not. Let’s then compute how more likely is the first model to be true than the second one. I will compute it using a relative likelihood (see Burnham, Anderson, and Huyvaert (2011)):
\[ e^{0.5 (AIC_{min} - AIC_{max})} \]
AIC_min = 2172.4
AIC_max = 2174.3
exp(0.5 * (AIC_min - AIC_max))
## [1] 0.386741
The probability that the model without disturbance is better is only 40% more. I will then keep the disturbance.
Then our best model is:
\[ RR (bioarea) = t + M + D + t*D + (t | system \: nr) \]
Let’s then calculate the R squared and stuff.
#Create a table in which the regional biomass has been log transformed.
### --- INITIALISE TABLE --- ###
columns = c("model", "time_point", "AIC", "BIC", "R2_mixed", "R2_fixed", "R2_mixed_M", "R2_fixed_M")
small_patches_matrix = matrix(ncol = length(columns), nrow = 0)
small_patches_table = data.frame(small_patches_matrix)
colnames(small_patches_table) = columns
### --- POPULATE THE TABLE --- ###
for (last_point in 4:7) {
full_model = lmer(RR_bioarea_per_volume ~
day +
eco_metaeco_type +
disturbance +
day : disturbance +
day : eco_metaeco_type +
(day | system_nr),
data = ds_biomass %>%
filter(time_point >= 2) %>%
filter(time_point <= last_point) %>%
filter(eco_metaeco_type== "S (S_S)" |
eco_metaeco_type == "S (S_L)"),
REML = FALSE,
control = lmerControl(optimizer = 'optimx',
optCtrl = list(method = 'L-BFGS-B')))
r.squaredGLMM(full_model)
null_model = lm(RR_bioarea_per_volume ~
1,
data = ds_biomass %>%
filter(time_point >= 2) %>%
filter(time_point <= last_point) %>%
filter(eco_metaeco_type== "S (S_S)" |
eco_metaeco_type == "S (S_L)"))
metaeco_null = lmer(RR_bioarea_per_volume ~
day +
disturbance +
day : disturbance +
day : eco_metaeco_type +
(day | system_nr),
data = ds_biomass %>%
filter(time_point >= 2) %>%
filter(time_point <= last_point) %>%
filter(eco_metaeco_type== "S (S_S)" |
eco_metaeco_type == "S (S_L)"),
REML = FALSE,
control = lmerControl(optimizer = "Nelder_Mead"))
r.squaredGLMM(metaeco_null)
small_patches_table = update_all_models_table("t + M + D + t*M + t*D + (t | ID)",
small_patches_table,
full_model,
null_model,
metaeco_null,
"mixed")
}
datatable(small_patches_table,
rownames = FALSE,
options = list(pageLength = 100,
scrollX = TRUE,
autoWidth = TRUE,
columnDefs = list(list(targets=c(0),visible=TRUE, width='160'),
list(targets=c(1), visible=TRUE, width='10'),
list(targets=c(2), visible=TRUE, width='10'),
list(targets=c(3), visible=TRUE, width='10'),
list(targets=c(4), visible=TRUE, width='10'),
list(targets=c(5), visible=TRUE, width='10'),
list(targets=c(6), visible=TRUE, width='10'),
list(targets=c(7), visible=TRUE, width='10'),
list(targets='_all', visible=FALSE))),
caption = "
M = Meta-ecosystem type,
D = disturbance,
(1 | t) = random effect of time on the intercept,
(1 | ID) = random effect of meta-ecosystem ID on the intercept,
|| = no correlation between intercept and slope,
| = correlation between intercept and slope,
R2 = r squared of the whole model,
R2_fixed = fixed part of the mixed model,
mixed_R2 = r squared when considering both fixed and random effects (conditional r squared),
fixed_R2 = r squared when considering only the fixed effects (marginal r squared)")
I’ll show model selection only on time point number 3 (however, I will do it also for the other time points). Let’s start from the full model.
time_point_input = 4
full = lmer(RR_bioarea_per_volume ~
metaecosystem_type +
disturbance +
disturbance : metaecosystem_type +
(1 | system_nr),
data = ds_biomass %>%
filter(time_point == time_point_input) %>%
filter(eco_metaeco_type== "S (S_S)" |
eco_metaeco_type == "S (S_L)"),
REML = FALSE)
Should we keep the random effect?
no_random = lm(RR_bioarea_per_volume ~
metaecosystem_type +
disturbance +
disturbance : metaecosystem_type,
data = ds_biomass %>%
filter(time_point == time_point_input) %>%
filter(eco_metaeco_type== "S (S_S)" |
eco_metaeco_type == "S (S_L)"))
anova(full, no_random)
## Data: ds_biomass %>% filter(time_point == time_point_input) %>% filter(eco_metaeco_type == ...
## Models:
## no_random: RR_bioarea_per_volume ~ metaecosystem_type + disturbance + disturbance:metaecosystem_type
## full: RR_bioarea_per_volume ~ metaecosystem_type + disturbance + disturbance:metaecosystem_type + (1 | system_nr)
## npar AIC BIC logLik deviance Chisq Df Pr(>Chisq)
## no_random 5 78.541 85.547 -34.271 68.541
## full 6 80.541 88.949 -34.271 68.541 0 1 1
No. But wired. I’ll keep it anyway.
Should we keep D * M?
no_MD = lmer(RR_bioarea_per_volume ~
metaecosystem_type +
disturbance +
(1 | system_nr),
data = ds_biomass %>%
filter(time_point == time_point_input) %>%
filter(eco_metaeco_type== "S (S_S)" |
eco_metaeco_type == "S (S_L)"),
REML = FALSE)
anova(full, no_MD)
## Data: ds_biomass %>% filter(time_point == time_point_input) %>% filter(eco_metaeco_type == ...
## Models:
## no_MD: RR_bioarea_per_volume ~ metaecosystem_type + disturbance + (1 | system_nr)
## full: RR_bioarea_per_volume ~ metaecosystem_type + disturbance + disturbance:metaecosystem_type + (1 | system_nr)
## npar AIC BIC logLik deviance Chisq Df Pr(>Chisq)
## no_MD 5 85.584 92.590 -37.792 75.584
## full 6 80.541 88.949 -34.271 68.541 7.0423 1 0.00796 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
No.
Best model
Therefore, our best model is
\[ Regional \: Bioarea = M + D + (1 | ID) \]
### --- SINGLE PATCHES --- ###
ds_biomass %>%
filter(disturbance == "low") %>%
filter(patch_size == "L") %>%
ggplot(aes(x = day,
y = bioarea_per_volume,
group = system_nr,
fill = system_nr,
color = system_nr,
linetype = eco_metaeco_type)) +
geom_line(stat = "summary", fun = "mean") +
labs(x = "Day",
y = "Local bioarea (something/μl)",
title = "Disturbance = low",
linetype = "") +
theme_bw() +
theme(panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
legend.position = c(.95, .95),
legend.justification = c("right", "top"),
legend.box.just = "right",
legend.margin = margin(6, 6, 6, 6)) +
scale_linetype_discrete(labels = c("large isolated",
"large connected to large",
"large connected to small")) +
geom_vline(xintercept = first_perturbation_day,
linetype="dotdash",
color = "grey",
size=0.7) +
labs(caption = "Vertical grey line: first perturbation")
ds_biomass %>%
filter(disturbance == "high") %>%
filter(patch_size == "L") %>%
ggplot(aes(x = day,
y = bioarea_per_volume,
group = system_nr,
fill = system_nr,
color = system_nr,
linetype = eco_metaeco_type)) +
geom_line(stat = "summary", fun = "mean") +
labs(x = "Day",
y = "Local bioarea (something/μl)",
title = "Disturbance = high",
linetype = "") +
theme_bw() +
theme(panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
legend.position = c(.95, .95),
legend.justification = c("right", "top"),
legend.box.just = "right",
legend.margin = margin(6, 6, 6, 6)) +
scale_linetype_discrete(labels = c("large isolated",
"large connected to large",
"large connected to small")) +
geom_vline(xintercept = first_perturbation_day,
linetype="dotdash",
color = "grey",
size=0.7) +
labs(caption = "Vertical grey line: first perturbation")
### --- BOXPLOTS --- ###
ds_biomass %>%
filter(disturbance == "low") %>%
filter(patch_size == "L") %>%
ggplot(aes(x = day,
y = bioarea_per_volume,
group = interaction(day,eco_metaeco_type),
fill = eco_metaeco_type)) +
geom_boxplot() +
labs(x = "Day",
y = "Local bioarea (something/μl)",
title = "Disturbance = low",
fill = "") +
theme_bw() +
theme(panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
legend.position = c(.95, .95),
legend.justification = c("right", "top"),
legend.box.just = "right",
legend.margin = margin(6, 6, 6, 6)) +
scale_fill_discrete(labels = c("large isolated",
"large connected to large",
"large connected to small")) +
geom_vline(xintercept = first_perturbation_day,
linetype="dotdash",
color = "grey",
size=0.7) +
labs(caption = "Vertical grey line: first perturbation")
local_large_high_plot = ds_biomass %>%
filter(disturbance == "high") %>%
filter(patch_size == "L") %>%
ggplot(aes(x = day,
y = bioarea_per_volume,
group = interaction(day,eco_metaeco_type),
fill = eco_metaeco_type)) +
geom_boxplot() +
labs(x = "Day",
y = "Local bioarea (something/μl)",
title = "Disturbance = high",
fill = "") +
theme_bw() +
theme(panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
legend.position = c(.95, .95),
legend.justification = c("right", "top"),
legend.box.just = "right",
legend.margin = margin(6, 6, 6, 6)) +
scale_fill_discrete(labels = c("large isolated",
"large connected to large",
"large connected to small")) +
geom_vline(xintercept = first_perturbation_day + 0.7,
linetype="dotdash",
color = "grey",
size=0.7) +
labs(caption = "Vertical grey line: first perturbation")
local_large_high_plot
Does M have an effect?
full_model = lmer(bioarea_per_volume ~
metaecosystem_type * disturbance +
(1 | system_nr) +
(1 | day),
data = ds_biomass %>%
filter (eco_metaeco_type == "L" |
eco_metaeco_type == "L (L_L)") %>%
filter(time_point >= 2),
REML = FALSE)
no_metaeco_type_model = lmer(bioarea_per_volume ~
disturbance +
(1 | system_nr) +
(1 | day),
data = ds_biomass %>%
filter (eco_metaeco_type == "L" |
eco_metaeco_type == "L (L_L)") %>%
filter(time_point >= 2),
REML = FALSE)
anova(full_model, no_metaeco_type_model)
## Data: ds_biomass %>% filter(eco_metaeco_type == "L" | eco_metaeco_type == ...
## Models:
## no_metaeco_type_model: bioarea_per_volume ~ disturbance + (1 | system_nr) + (1 | day)
## full_model: bioarea_per_volume ~ metaecosystem_type * disturbance + (1 | system_nr) + (1 | day)
## npar AIC BIC logLik deviance Chisq Df Pr(>Chisq)
## no_metaeco_type_model 5 3879.5 3896.9 -1934.7 3869.5
## full_model 7 3876.5 3900.9 -1931.2 3862.5 6.9701 2 0.03065
##
## no_metaeco_type_model
## full_model *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Yes.
Does M have an effect?
full_model = lmer(bioarea_per_volume ~
metaecosystem_type * disturbance +
(1 | system_nr) +
(1 | day),
data = ds_biomass %>%
filter (eco_metaeco_type == "L" |
eco_metaeco_type == "L (S_L)") %>%
filter(time_point >= 2),
REML = FALSE)
no_metaeco_type_model = lmer(bioarea_per_volume ~
disturbance +
(1 | system_nr) +
(1 | day),
data = ds_biomass %>%
filter (eco_metaeco_type == "L" |
eco_metaeco_type == "L (S_L)") %>%
filter(time_point >= 2),
REML = FALSE)
anova(full_model, no_metaeco_type_model)
## Data: ds_biomass %>% filter(eco_metaeco_type == "L" | eco_metaeco_type == ...
## Models:
## no_metaeco_type_model: bioarea_per_volume ~ disturbance + (1 | system_nr) + (1 | day)
## full_model: bioarea_per_volume ~ metaecosystem_type * disturbance + (1 | system_nr) + (1 | day)
## npar AIC BIC logLik deviance Chisq Df Pr(>Chisq)
## no_metaeco_type_model 5 2566.9 2582.2 -1278.4 2556.9
## full_model 7 2559.7 2581.2 -1272.9 2545.7 11.151 2 0.00379
##
## no_metaeco_type_model
## full_model **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Yes.
Does M have an effect?
full_model = lmer(bioarea_per_volume ~
metaecosystem_type * disturbance +
(1 | system_nr) +
(1 | day),
data = ds_biomass %>%
filter (eco_metaeco_type == "L (L_L)" |
eco_metaeco_type == "L (S_L)") %>%
filter(time_point >= 2),
REML = FALSE)
no_metaeco_type_model = lmer(bioarea_per_volume ~
disturbance +
(1 | system_nr) +
(1 | day),
data = ds_biomass %>%
filter (eco_metaeco_type == "L (L_L)" |
eco_metaeco_type == "L (S_L)") %>%
filter(time_point >= 2),
REML = FALSE)
anova(full_model, no_metaeco_type_model)
## Data: ds_biomass %>% filter(eco_metaeco_type == "L (L_L)" | eco_metaeco_type == ...
## Models:
## no_metaeco_type_model: bioarea_per_volume ~ disturbance + (1 | system_nr) + (1 | day)
## full_model: bioarea_per_volume ~ metaecosystem_type * disturbance + (1 | system_nr) + (1 | day)
## npar AIC BIC logLik deviance Chisq Df Pr(>Chisq)
## no_metaeco_type_model 5 3835.5 3852.9 -1912.8 3825.5
## full_model 7 3832.2 3856.6 -1909.1 3818.2 7.2508 2 0.02664
##
## no_metaeco_type_model
## full_model *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Yes.
ds_biomass %>%
filter ( disturbance == "low") %>%
filter(metaecosystem == "no") %>%
group_by (system_nr, day, patch_size) %>%
summarise(mean_bioarea_per_volume_across_videos = mean(bioarea_per_volume)) %>%
ggplot (aes(x = day,
y = mean_bioarea_per_volume_across_videos,
group = system_nr,
fill = system_nr,
color = system_nr,
linetype = patch_size)) +
geom_line () +
labs(x = "Day",
y = "Regional bioarea (something/µl)",
title = "Disturbance = low",
fill = "System nr",
linetype = "") +
scale_y_continuous(limits = c(0, 6250)) +
scale_x_continuous(limits = c(-2, 30)) +
scale_colour_continuous(guide = "none") +
theme_bw() +
theme(panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
legend.position = c(.95, .95),
legend.justification = c("right", "top"),
legend.box.just = "right",
legend.margin = margin(6, 6, 6, 6)) +
scale_linetype_discrete(labels = c("large isolated",
"medium isolated",
"small isolated"))
ds_biomass %>%
filter ( disturbance == "high") %>%
filter(metaecosystem == "no") %>%
group_by (system_nr, day, patch_size) %>%
summarise(mean_bioarea_per_volume_across_videos = mean(bioarea_per_volume)) %>%
ggplot (aes(x = day,
y = mean_bioarea_per_volume_across_videos,
group = system_nr,
fill = system_nr,
color = system_nr,
linetype = patch_size)) +
geom_line () +
labs(x = "Day",
y = "Regional bioarea (something/µl)",
title = "Disturbance = low",
fill = "System nr",
linetype = "") +
scale_y_continuous(limits = c(0, 6250)) +
scale_x_continuous(limits = c(-2, 30)) +
scale_colour_continuous(guide = "none") +
theme_bw() +
theme(panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
legend.position = c(.95, .95),
legend.justification = c("right", "top"),
legend.box.just = "right",
legend.margin = margin(6, 6, 6, 6)) +
scale_linetype_discrete(labels = c("large isolated",
"medium isolated",
"small isolated"))
ds_biomass %>%
filter(disturbance == "low") %>%
filter(metaecosystem == "no") %>%
ggplot(aes(x = day,
y = bioarea_per_volume,
group = interaction(day, patch_size),
fill = patch_size)) +
geom_boxplot() +
labs(title = "Disturbance = low",
x = "Day",
y = "Local bioarea (something/μl)",
fill = "") +
scale_fill_discrete(labels = c("isolated large", "isolated medium", "isolated small")) +
theme_bw() +
theme(panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
legend.position = c(.95, .95),
legend.justification = c("right", "top"),
legend.box.just = "right",
legend.margin = margin(6, 6, 6, 6))
ds_biomass %>%
filter(disturbance == "high") %>%
filter(metaecosystem == "no") %>%
ggplot(aes(x = day,
y = bioarea_per_volume,
group = interaction(day, patch_size),
fill = patch_size)) +
geom_boxplot() +
labs(title = "Disturbance = high",
x = "Day",
y = "Local bioarea (something/μl)",
fill = "") +
scale_fill_discrete(labels = c("isolated large", "isolated medium", "isolated small")) +
theme_bw() +
theme(panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
legend.position = c(.95, .95),
legend.justification = c("right", "top"),
legend.box.just = "right",
legend.margin = margin(6, 6, 6, 6))
We want to know if there was a systematic bias in the evaporation of different treatments (disturbance, patch size) and whether evaporation changed across time. My expectation would be that we would see a difference among the exchanges 2,3 and the exchanges 4,5,6. This is because in exchange 2,3 cultures were microwaved in 15 tubes for 3 minutes and in exchange 4,5,6 cultures were microwaved in 4 tubes for only 1 minute.
#Columns: exchange & evaporation
ds_for_evaporation = gather(ds_for_evaporation,
key = exchange,
value = evaporation,
water_add_after_t2:water_add_after_t6)
ds_for_evaporation[ds_for_evaporation == "water_add_after_t2"] = "2"
ds_for_evaporation[ds_for_evaporation == "water_add_after_t3"] = "3"
ds_for_evaporation[ds_for_evaporation == "water_add_after_t4"] = "4"
ds_for_evaporation[ds_for_evaporation == "water_add_after_t5"] = "5"
ds_for_evaporation[ds_for_evaporation == "water_add_after_t6"] = "6"
ds_for_evaporation$evaporation[ds_for_evaporation$exchange == 2] = ds_for_evaporation$evaporation[ds_for_evaporation$exchange == 2] / 2 #This is because exchange contained the topping up of two exchanges
ds_for_evaporation$evaporation[ds_for_evaporation$exchange == 2] = ds_for_evaporation$evaporation[ds_for_evaporation$exchange == 2] + 2 #We need to add 2 ml to the evaporation that happened at the exchange events 1 and 2. This is because we already added 1 ml of water at exchange 1 and 1 ml of water at exchange 2.
#Column: nr_of_tubes_in_rack
ds_for_evaporation$nr_of_tubes_in_rack[ds_for_evaporation$exchange == 1] = 15
ds_for_evaporation$nr_of_tubes_in_rack[ds_for_evaporation$exchange == 2] = 15
ds_for_evaporation$nr_of_tubes_in_rack[ds_for_evaporation$exchange == 3] = 15
ds_for_evaporation$nr_of_tubes_in_rack[ds_for_evaporation$exchange == 4] = 4
ds_for_evaporation$nr_of_tubes_in_rack[ds_for_evaporation$exchange == 5] = 4
ds_for_evaporation$nr_of_tubes_in_rack[ds_for_evaporation$exchange == 6] = 4
ds_for_evaporation %>%
filter(disturbance == disturbance) %>%
ggplot(aes(x = as.character(nr_of_tubes_in_rack),
y = evaporation)) +
geom_boxplot() +
labs(x = "Number of tubes in rack",
y = "Evaporation (ml)")
ds_for_evaporation %>%
filter(disturbance == disturbance) %>%
ggplot(aes(x = as.character(patch_size),
y = evaporation)) +
geom_boxplot() +
labs(x = "Patch size",
y = "Evaporation (ml)")
ds_for_evaporation %>%
filter(disturbance == disturbance) %>%
ggplot(aes(x = as.character(day),
y = evaporation)) +
geom_boxplot() +
labs(x = "Day",
y = "Evaporation (ml)")
ds_for_evaporation %>%
filter(disturbance == disturbance) %>%
ggplot(aes(x = disturbance,
y = evaporation)) +
geom_boxplot() +
labs(x = "Disturbance",
y = "Evaporation (ml)")
It seems like there is no real difference across time, disturbance, or patch type. However, we could also run a mixed effect model to show that they do not.
It gives me the following error:
mixed.model = lmer(evaporation ~
patch_size * disturbance * exchange +
(exchange | culture_ID),
data = ds_for_evaporation,
REML = FALSE,
control = lmerControl (optimizer = "Nelder_Mead"))
null.model = lm(evaporation ~
1,
data = ds_for_evaporation)
anova(mixed.model, null.model)
culture_info = read.csv(here("data", "PatchSizePilot_culture_info.csv"), header = TRUE)
load(here("data", "morphology", "t0.RData"));t0 = morph_mvt
load(here("data", "morphology", "t1.RData"));t1 = morph_mvt
load(here("data", "morphology", "t2.RData"));t2 = morph_mvt
load(here("data", "morphology", "t3.RData"));t3 = morph_mvt
load(here("data", "morphology", "t4.RData"));t4 = morph_mvt
load(here("data", "morphology", "t5.RData"));t5 = morph_mvt
load(here("data", "morphology", "t6.RData"));t6 = morph_mvt
load(here("data", "morphology", "t7.RData"));t7 = morph_mvt
rm(morph_mvt)
datatable(culture_info[,1:10],
rownames = FALSE,
options = list(scrollX = TRUE),
filter = list(position = 'top',
clear = FALSE))
### --- Tidy t0 - t7 data-sets --- ###
#Column: time
t0$time = NA
t1$time = NA
#Column: replicate_video
t0$replicate_video[t0$file == "sample_00001"] = 1
t0$replicate_video[t0$file == "sample_00002"] = 2
t0$replicate_video[t0$file == "sample_00003"] = 3
t0$replicate_video[t0$file == "sample_00004"] = 4
t0$replicate_video[t0$file == "sample_00005"] = 5
t0$replicate_video[t0$file == "sample_00006"] = 6
t0$replicate_video[t0$file == "sample_00007"] = 7
t0$replicate_video[t0$file == "sample_00008"] = 8
t0$replicate_video[t0$file == "sample_00009"] = 9
t0$replicate_video[t0$file == "sample_00010"] = 10
t0$replicate_video[t0$file == "sample_00011"] = 11
t0$replicate_video[t0$file == "sample_00012"] = 12
t1$replicate_video = 1 #In t1 I took only 1 video/culture
t2$replicate_video = 1 #In t2 I took only 1 video/culture
t3$replicate_video = 1 #In t3 I took only 1 video/culture
t4$replicate_video = 1 #In t4 I took only 1 video/culture
t5$replicate_video = 1 #In t5 I took only 1 video/culture
t6 = t6 %>% rename(replicate_video = replicate)
t7 = t7 %>% rename(replicate_video = replicate)
### --- Create ds_body_size dataset --- ###
long_t0 = t0 %>% slice(rep(1:n(), max(culture_info$culture_ID)))
ID_vector = NULL
ID_vector_elongating = NULL
for (ID in 1:max(culture_info$culture_ID)){
ID_vector = rep(ID, times = nrow(t0))
ID_vector_elongating = c(ID_vector_elongating, ID_vector)
}
long_t0$culture_ID = ID_vector_elongating
t0 = merge(culture_info,long_t0, by="culture_ID"); rm(long_t0)
t1 = merge(culture_info,t1,by="culture_ID")
t2 = merge(culture_info,t2,by="culture_ID")
t3 = merge(culture_info,t3,by="culture_ID")
t4 = merge(culture_info,t4,by="culture_ID")
t5 = merge(culture_info,t5,by="culture_ID")
t6 = merge(culture_info,t6,by="culture_ID")
t7 = merge(culture_info,t7,by="culture_ID")
ds_body_size = rbind(t0, t1, t2, t3, t4, t5, t6, t7); rm(t0, t1, t2, t3, t4, t5, t6, t7)
### --- Tidy ds_body_size data-set --- ###
#Column: day
ds_body_size$day = ds_body_size$time_point;
ds_body_size$day[ds_body_size$day=="t0"] = "0"
ds_body_size$day[ds_body_size$day=="t1"] = "4"
ds_body_size$day[ds_body_size$day=="t2"] = "8"
ds_body_size$day[ds_body_size$day=="t3"] = "12"
ds_body_size$day[ds_body_size$day=="t4"] = "16"
ds_body_size$day[ds_body_size$day=="t5"] = "20"
ds_body_size$day[ds_body_size$day=="t6"] = "24"
ds_body_size$day[ds_body_size$day=="t7"] = "28"
ds_body_size$day = as.numeric(ds_body_size$day)
#Column: time point
ds_body_size$time_point[ds_body_size$time_point=="t0"] = 0
ds_body_size$time_point[ds_body_size$time_point=="t1"] = 1
ds_body_size$time_point[ds_body_size$time_point=="t2"] = 2
ds_body_size$time_point[ds_body_size$time_point=="t3"] = 3
ds_body_size$time_point[ds_body_size$time_point=="t4"] = 4
ds_body_size$time_point[ds_body_size$time_point=="t5"] = 5
ds_body_size$time_point[ds_body_size$time_point=="t6"] = 6
ds_body_size$time_point[ds_body_size$time_point=="t7"] = 7
ds_body_size$time_point = as.character(ds_body_size$time_point)
#Column: eco_metaeco_type
ds_body_size$eco_metaeco_type = factor(ds_body_size$eco_metaeco_type,
levels=c('S', 'S (S_S)', 'S (S_L)', 'M', 'M (M_M)', 'L', 'L (L_L)', 'L (S_L)'))
#Select useful columns
ds_body_size = ds_body_size %>%
select(culture_ID,
patch_size,
disturbance,
metaecosystem_type,
mean_area,
replicate_video,
day,
metaecosystem,
system_nr,
eco_metaeco_type)
#Reorder columns
ds_body_size = ds_body_size[, c("culture_ID",
"system_nr",
"disturbance",
"day",
"patch_size",
"metaecosystem",
"metaecosystem_type",
"eco_metaeco_type",
"replicate_video",
"mean_area")]
datatable(ds_body_size,
rownames = FALSE,
options = list(scrollX = TRUE),
filter = list(position = 'top',
clear = FALSE))
## Warning in instance$preRenderHook(instance): It seems your data is too big
## for client-side DataTables. You may consider server-side processing: https://
## rstudio.github.io/DT/server.html
I am here creating 12 size classes as in Jacquet, Gounand, and Altermatt (2020). However, for some reason it seems like our body size classes are really different.
#### --- PARAMETERS & INITIALISATION --- ###
nr_of_size_classes = 12
largest_size = max(ds_body_size$mean_area)
size_class_width = largest_size/nr_of_size_classes
size_class = NULL
### --- CREATE DATASET --- ###
size_class_boundaries = seq(0, largest_size, by = size_class_width)
for (class in 1:nr_of_size_classes){
bin_lower_limit = size_class_boundaries[class]
bin_upper_limit = size_class_boundaries[class+1]
size_input = (size_class_boundaries[class] + size_class_boundaries[class + 1])/2
size_class[[class]] = ds_body_size%>%
filter(bin_lower_limit <= mean_area) %>%
filter(mean_area <= bin_upper_limit) %>%
group_by(culture_ID,
system_nr,
disturbance,
day,
patch_size,
metaecosystem,
metaecosystem_type,
eco_metaeco_type,
replicate_video) %>% #Group by video
summarise(mean_abundance_across_videos = n()) %>%
group_by(culture_ID,
system_nr,
disturbance,
day,
patch_size,
metaecosystem,
metaecosystem_type,
eco_metaeco_type) %>% #Group by ID
summarise(abundance = mean(mean_abundance_across_videos)) %>%
mutate(log_abundance = log(abundance)) %>%
mutate(size_class = class) %>%
mutate(size = size_input) %>%
mutate(log_size = log(size))
}
ds_classes = rbind(size_class[[1]], size_class[[2]], size_class[[3]], size_class[[4]],
size_class[[5]], size_class[[6]], size_class[[7]], size_class[[8]],
size_class[[9]], size_class[[10]], size_class[[11]], size_class[[12]],)
datatable(ds_classes,
rownames = FALSE,
options = list(scrollX = TRUE),
filter = list(position = 'top',
clear = FALSE))
#Trying out gganimate, but I can't seem to manage to install transformr packaget
p = list()
n = 0
first_level = c("isolated small", "isolated small", "isolated large", "isolated large")
second_level = c("small connected to small", "small connected to small", "large connected to large", "large connected to large")
third_level = c("small connected to large", "small connected to large", "large connected to small", "large connected to small")
for (patch_size_input in c("S", "L")){
for(disturbance_input in c("low", "high")){
n = n + 1
title = paste0(patch_size_input,
' patches, Disturbance = ',
disturbance_input,
', Day: {round(frame_time, digits = 0)}')
p[[n]] <- ds_classes %>%
filter(disturbance == disturbance_input) %>%
filter(patch_size == patch_size_input) %>%
ggplot(aes(x = log_size,
y = log_abundance,
group = interaction(log_size, eco_metaeco_type),
color = eco_metaeco_type)) +
geom_point(stat = "summary", fun = "mean") +
geom_line(stat = "summary", fun = "mean", aes(group=eco_metaeco_type)) +
scale_color_discrete(labels = c(first_level[n],
second_level[n],
third_level[n])) +
theme_bw() +
theme(panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
legend.position = c(.95, .95),
legend.justification = c("right", "top"),
legend.box.just = "right",
legend.margin = margin(6, 6, 6, 6)) +
labs(title = title,
x = 'Log size (μm2)',
y = 'Log abundance + 1 (indiv/μm2)',
color = "") +
transition_time(day) +
ease_aes('linear')
animate(p[[n]],
duration = 10,
fps = 25,
width = 500,
height = 500,
renderer = gifski_renderer())
anim_save(here("gifs",
paste0("transition_day_",
patch_size_input,"_",
disturbance_input,
".gif")))
}
}
Regional biomass production (mean bioarea density between two patches) in meta-ecosystems of the same total area, but whose two patches have either the same size or that have a smaller and larger patch. For clarity, only the low disturbance treatment is shown here. See the Appendix for equivalent the figure of the high disturbance treatment.
Local biomass production (bioarea density) in patches that are either isolated, connnected to a patch of the same size or to a patch of a larger size. For clarity, only the low disturbance treatment is shown here. See the Appendix for equivalent the figure of the high disturbance treatment.
Local biomass production in patches that are either isolated, connected to the same size, or connected to a smaller size. For clarity, only the high disturbance treatment is shown here. See the Appendix for equivalent the figure of the low disturbance treatment.
small_patches_figure + labs(title = "")
Small patches this time in logRR
evaporation.test = read.csv(here("data", "evaporation_test","evaporation_test_right.csv"), header = TRUE)
evaporation.test %>%
ggplot(aes (x = as.character(water_pipetted),
y = weight_water_evaporated,
group = interaction(water_pipetted, as.character(rack)),
fill = as.character(rack))) +
geom_boxplot() +
labs(x = "Water volume (ml)" ,
y = "Evaporation (g)",
fill = "Rack replicate")
evaporation.test = read.csv(here("data", "evaporation_test", "evaporation_test_fill_nofill.csv"), header = TRUE)
evaporation.test %>%
ggplot(aes (x = all_tubes_water,
y = weight_water_evaporated)) +
geom_boxplot() +
labs(x = "Water in the other 10 tubes" ,
y = "Evaporation (g)",
caption = "When all tubes were filled, they were filled with 6.75 ml of deionised water (I think, but I need to check in my lab book.")
## Time difference of 43.23396 secs
To build the mixed effect models we will use the R package lme4. See page 6 of this PDF to know more about the syntaxis of this package.
To do model diagnostics of mixed effect models, I’m going to look at the following two plots (as suggested by Zuur et al. (2009), page 487):
Quantile-quantile plots
Partial residual plots
The effect size of the explaining variables is calculated in the
mixed effect models as marginal and conditional r squared. The marginal
r squared is how much variance is explained by the fixed effects. The
conditional r squared is how much variance is explained by the fixed and
the random effects. The marginal and conditional r squared are
calculated using the package MuMIn. The computation is
based on the methods of Nakagawa, Johnson, and
Schielzeth (2017). For the coding and interpretation of these r
squared check the documentation
for the r.squaredGLMM function
See for the interaction syntaxis this link.
I am starting from the largest model and then simplifying because … (see statistical modelling course at ETH).
I am going to select the best model according to AIC. BIC is better for understanding and AIC for predicting. Halsey (2019) also suggests this approach instead of p values. I’m going to use AIC because I’m interested in knowing how much meta-ecosystem type contributed to the overall regional biomass.